IEEE/ACM Transactions on Computational Biology and Bioinformatics — Latest Matching Preprints

1

Parameter-efficient deep learning for pneumonia detection on chest X-rays: A comparative evaluation of explainable AI methods

Mahtabi, B.; Nasr-Esfahani, E.; Yaraghi, S.

2026-07-16 radiology and imaging 10.64898/2026.07.14.26358065 medRxiv

Top 0.9%

1.0%

Pneumonia is a leading cause of infectious disease mortality worldwide, accounting for approximately 2.5 million deaths annually and 15% of deaths in children under five. Chest X-ray imaging remains the primary diagnostic tool, but accurate interpretation requires radiological expertise that is disproportionately concentrated in high-income settings, creating a diagnostic gap where disease burden is highest. Automated deep learning offers a scalable complement to specialist-dependent diagnosis, yet clinical adoption requires both high accuracy and transparent, interpretable reasoning. Convolutional neural networks (CNNs) have shown strong potential for pneumonia detection from chest X-rays, but two barriers impede clinical translation: the interpretability of black-box models and the computational feasibility of large architectures in resource-constrained settings. Explainable AI (XAI) methods such as Grad-CAM, Grad-CAM++, and Score-CAM address the interpretability barrier, yet systematic quantitative comparisons across multiple CNN architectures remain scarce. Furthermore, CNN architectures widely used for medical image classification carry high parameter counts that limit feasibility in resource-constrained settings, motivating architectures that achieve competitive accuracy with substantially fewer parameters. Here we propose a parameter-efficient deep learning framework for pneumonia detection based on transfer learning, evaluated across three CNN architectures representing distinct architectural families: EfficientNet-B0 with fine-tuning (proposed method), ResNet50, and DenseNet121, trained under identical conditions on the Kaggle chest X-ray dataset (5,863 images). Our method achieved 90% classification accuracy, outperforming both baselines while requiring 4.8x fewer parameters than ResNet50. 
To evaluate explainability, Grad-CAM, Grad-CAM++, and Score-CAM were applied across all three architectures and compared quantitatively using Intersection over Union against manually annotated lung segmentation masks, Insertion score, and Deletion score, with pairwise statistical validation via Wilcoxon signed-rank tests and Bonferroni correction. Findings show that classification accuracy and XAI explanation quality must be evaluated independently, and that the proposed parameter-efficient architecture offers a favorable trade-off for resource-constrained clinical deployment.
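
The Intersection over Union comparison described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `heatmap_iou`, the binarization threshold of 0.5, and the toy 2x2 arrays are all assumptions made for the example.

```python
def heatmap_iou(heatmap, mask, threshold=0.5):
    """IoU between a thresholded saliency map and a binary mask.

    heatmap: 2D list of saliency values in [0, 1] (e.g. a Grad-CAM output).
    mask: 2D list of 0/1 values (e.g. a lung segmentation mask).
    threshold: hypothetical cut-off used to binarize the heatmap.
    """
    salient = {
        (r, c)
        for r, row in enumerate(heatmap)
        for c, value in enumerate(row)
        if value >= threshold
    }
    region = {
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    }
    union = salient | region
    if not union:
        return 0.0  # both empty: define IoU as 0 for this sketch
    return len(salient & region) / len(union)


# Toy example: two salient pixels, two mask pixels, one pixel shared.
heatmap = [[0.9, 0.2], [0.7, 0.1]]
mask = [[1, 1], [0, 0]]
print(heatmap_iou(heatmap, mask))
```

A high IoU here only says the highlighted region overlaps the mask; it says nothing about classification accuracy, which is why the two are reported separately.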

2

The Variance-Stabilizing Transformation for the Poisson Rate Ratio: Closed-Form Confidence Intervals

Ng, S.-P.

2026-07-18 epidemiology 10.64898/2026.07.16.26358255 medRxiv

Top 1%

0.8%

The incidence rate ratio R is the standard measure for comparing event rates in clinical trials and epidemiology. In vaccine trials, the vaccine efficacy is VE = 1 - R. When events are rare, the two arm counts are Poisson. The estimator of R is heteroskedastic: its sampling variance changes with the data. So no fixed-width interval covers correctly everywhere. The usual log-Wald interval is undefined at zero events and covers poorly at small counts. Early vaccine and drug-safety readouts fall in exactly this regime. We show that a single reparameterization collapses this bivariate problem to an effective one-parameter family with a quadratic variance function, whose variance-stabilizing transformation is 2 arcsinh(sqrt(R)). The reduction yields a closed-form confidence interval for R. Its two leading errors, a curvature bias and the variability of the estimated scale, each admit a closed-form correction with no tuning constants. In a Monte Carlo study of our seven arcsinh variants and five competitors, the +Curve+Stu variant covers within 0.002 of the nominal 0.95 for about 50 control and 5 treatment events. Its width is on par with the best competitor. It avoids the conservatism and zero-count breakdown of log-Wald and MOVER. For moderate counts, we recommend this interval; for sparser data, our Bar-Lev and Enis count-shift variant is more robust. The result is a ready-to-use, closed-form interval for the low-count regime. We illustrate it on early Covid-19 vaccine-efficacy readouts and provide reference implementations in R and Python.
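
To see why 2 arcsinh(sqrt(R)) stabilizes a quadratic variance function, the sketch below checks numerically that its derivative equals 1/sqrt(R(1 + R)). It assumes the variance of the estimator is proportional to R(1 + R), which is what the delta method gives for a ratio of two independent Poisson counts with equal exposure; the abstract does not spell out the variance function, so this is an assumption, and the function names are illustrative.

```python
import math


def vst(r):
    """Candidate variance-stabilizing transformation g(R) = 2 asinh(sqrt(R))."""
    return 2.0 * math.asinh(math.sqrt(r))


def numerical_derivative(f, x, h=1e-6):
    """Central finite-difference approximation to f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def target_slope(r):
    """1 / sqrt(V(R)) for the assumed variance function V(R) = R (1 + R)."""
    return 1.0 / math.sqrt(r * (1.0 + r))


# A transformation g stabilizes variance when g'(R) is proportional to
# 1 / sqrt(V(R)); the two columns printed here should agree closely.
for r in (0.1, 1.0, 5.0):
    print(r, numerical_derivative(vst, r), target_slope(r))
```

On the transformed scale the variance no longer depends on R to first order, so a symmetric interval can be formed there and mapped back with the inverse, R = sinh(g/2)^2.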

3

Analytical perturbation reveals hidden instability of biological phenotypes

Piorkowska, N. J.; Ostromecki, A.; Franik, G.; Bizon, A.

2026-07-16 endocrinology 10.64898/2026.07.13.26357916 medRxiv

Top 1%

0.6%

Background Unsupervised machine learning has become a cornerstone of computational phenotyping across clinical medicine, genomics, imaging, and multi-omics research. However, phenotype discovery relies on a sequence of analytical decisions - including missing-data handling, preprocessing, dimensionality reduction, clustering methodology, and stochastic initialization - that are rarely evaluated collectively. Although clustering stability has been extensively investigated, the robustness of complete analytical workflows remains largely unexplored. Results We developed an Analytical Perturbation Framework that systematically quantifies the robustness of phenotype discovery by perturbing complete unsupervised learning workflows rather than individual clustering algorithms. Using a real-world cohort of 1,286 women with polycystic ovary syndrome (PCOS), we generated 116 valid analytical pipelines comprising alternative preprocessing strategies, missing-data handling methods, dimensionality reduction approaches, clustering algorithms, and random initializations. Agreement between independently generated phenotype solutions was consistently low (median Adjusted Rand Index = 0.079), indicating substantial sensitivity of phenotype discovery to routine analytical decisions. Variance decomposition identified preprocessing as the largest contributor to phenotype instability (22.8%), followed by clustering methodology (14.6%), whereas stochastic initialization explained only 3.1% of the observed variability. At the patient level, most individuals exhibited reproducible phenotype assignments (median Patient Robustness Score = 0.719), although a substantial subgroup showed markedly lower assignment stability. 
Feature perturbation analyses identified follicle-stimulating hormone, anti-thyroglobulin antibodies, anti-thyroid peroxidase antibodies, total testosterone, luteinizing hormone, and androstenedione as the strongest contributors to computational robustness, rather than biological importance. Finally, phenotype solutions demonstrating greater computational robustness also exhibited greater biological coherence during independent validation.
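
The agreement measure quoted above, the Adjusted Rand Index, can be computed from the contingency table of two labelings. The sketch below is a plain standard-library implementation of the usual pair-counting formula; it is illustrative and not taken from the authors' framework.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two cluster assignments of the same items."""
    n = len(labels_a)
    if n != len(labels_b) or n < 2:
        raise ValueError("need two labelings of the same length (at least 2 items)")
    # Pairs of items placed together in both labelings.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together in each labeling separately.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / comb(n, 2)
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings put everything in one cluster
    return (together_both - expected) / (maximum - expected)


# Cluster names do not matter, only the grouping of items.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))
```

Values near 0, such as the reported median of 0.079, mean two pipelines agree little more than random assignments with the same cluster sizes would.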

4

LocusBlend: Flexible multi-index regional visualization of genomic association signals

Yang, C.; Cook, N.; Zeng, Y.; Fu, T.; Budde, J.; Cruchaga, C.; Belloy, M. E.

2026-07-21 genetic and genomic medicine 10.64898/2026.07.15.26358129 medRxiv

Top 3%

0.2%

Summary: It has become standard practice to visualize regional signals from genome-wide association studies (GWAS) using LocusZoom plots. Similarly, GWAS signals are compared to regionally matched quantitative trait loci (QTLs), i.e., variant-to-gene regulation data, using LocusCompare plots to aid assessment of candidate trait-related genes. Despite broad usage, these tools annotate variants by linkage disequilibrium (LD) to a single lead or index variant. This single-index representation has limitations for visualizing complex loci that contain multiple independent signals. We present LocusBlend, an interactive web application for multi-index, LD-blended visualization of genomic loci. LocusBlend supports one or two genomic association summary-statistic datasets and one to three index variants, multi-index LocusZoom color-blended plots, and matching LocusCompare visualizations. Applications to Alzheimer's disease GWAS and QTL signals illustrate that LocusBlend enables visualization and separation of independent signals despite shared LD and high genomic complexity. Overall, LocusBlend is aimed at helping researchers handle the continuously expanding complexity of human genomics findings. Availability and Implementation: LocusBlend is freely available at https://locusblend.wustl.edu. Publication-ready plots are generated in 1 min. Source code, documentation, example datasets, input templates, and reproducibility instructions are available at https://github.com/BelloyLab/LocusBlend. LocusBlend is implemented in Python using Streamlit, Plotly, and PLINK. Supplementary Information: Supplementary data are available online.

5

Validating Artificial Intelligence Guidance for Ultrasound Acquisition and Remote Interpretation

Maldonado, T.; Muluk, S.; Rali, P.; Soni, N.; Nathanson, R.; Kuttab, H.; VandeHei, M.; Michels, C.; Swietlik, J.; Speranza, G.; Schaffer, O.; Collaborating Investigators Group; Al Noor, F.; Mischkewitz, S.; Kainz, B.; Blaivas, M.; Jacobowitz, G.

2026-07-19 radiology and imaging 10.64898/2026.07.16.26356882 medRxiv

Top 3%

0.2%

Background: Venous thromboembolism (VTE), including deep vein thrombosis (DVT), remains a major global health burden. Diagnostic pathways rely on ultrasound but are limited by availability and prolonged time-to-imaging. Novel artificial intelligence (AI) guidance systems have been designed to enable non-ultrasound-trained operators to acquire proximal lower extremity compression ultrasounds for remote clinician interpretation. Methods: This multicenter, double-blinded, prospective, nonrandomized study evaluated the performance of an AI guidance system (ThinkSono Guidance, ThinkSono, GmbH). Patients underwent AI-guided ultrasound(s) and standard of care ultrasound(s). Primary and secondary endpoints were image quality, sensitivity and specificity for proximal DVT, and prioritization specificity, a measure of specificity in identifying patients requiring standard of care ultrasound after AI-guided scan. Results: Of 634 recruited subjects, 594 were analyzed, with 67 DVTs across 700 scans. 86.83% of AI-guided scans achieved diagnostic image quality. Triage sensitivity was 92.86%, triage specificity 39.12%, and prioritization specificity 97.96%. Standard of care ultrasounds could be avoided in 35.32% of patients. Total median AI-guided scan and review time was 7.57 minutes. Conclusions: Clinician-reviewed AI-guided scans were rapid, sensitive for DVT, and specific for prioritizing patients requiring standard of care ultrasounds. These findings suggest AI-guided ultrasound may be a scalable triage strategy to expand DVT evaluation access, particularly in resource-constrained and after-hours settings.

6

CuGen: A GPU-accelerated framework for large-scale genomics

Kiiskinen, T.; Richland, J.; Wang, W.; Lu, W. S.; Balasubramanian, N.; Hastie, T.; Tibshirani, R.; Rivas, M. A.

2026-07-17 genetic and genomic medicine 10.64898/2026.07.15.26358178 medRxiv

Top 4%

0.1%

Biobank-scale genomic analyses remain computationally expensive, CPU-bound workflows, particularly when adjusting for confounding. Here, we present CuGen, a GPU-accelerated framework for large-scale genomics. CuGen uses UltraLasso, a novel hierarchical application of univariate-guided sparse regression (uniLasso), to select a compact, phenotype-informed active set of fewer than 30,000 variants. This achieves robust leave-one-chromosome-out (LOCO) confounding control, enabling both downstream GWAS and in-sample fine-mapping. Additionally, we introduce the .cugen file format, a genotype representation designed for memory-optimized, high-throughput streaming and random access on GPU hardware. Building on this substrate, we provide a general GPU-accelerated genomics toolkit handling polygenic prediction, data manipulation, quality control, analysis, and visualization. We demonstrate CuGen's efficacy in the UK Biobank with up to 408,624 individuals, where the full GWAS pipeline and fine-mapping against 6.8 million imputed variants completes in approximately 10 minutes on a single high-throughput GPU with 80 GB of memory. The pipeline scales efficiently to massive phenome-wide analyses with sublinear resource consumption.

7

Across-Site MRI Prediction of Substantial Lymphovascular Space Invasion in Endometrial Cancer: Radiomics versus Deep Learning Features

Di Giovanni, D. A.; Tanaka, A.; Horikoshi, T.; Tsuboyama, T.; Yokota, H.; Zakarian, R.; Matsumoto, Y.; Vallieres, M.; Reinhold, C.

2026-07-16 radiology and imaging 10.64898/2026.07.14.26358100 medRxiv

Top 4%

0.1%

Purpose: To compare the cross-site generalization of radiomic features and deep learning embeddings for MRI prediction of substantial lymphovascular space invasion (LVSI) in endometrial cancer. Materials and Methods: This retrospective two-center study included 206 women (mean age, 59.8 years) with endometrial cancer who underwent preoperative 3-T MRI from March 2016 to March 2023. Hospital A (n = 130) was used for development and Hospital B (n = 76) for strict external testing. T2-weighted, reduced field-of-view diffusion-weighted, and apparent diffusion coefficient images were manually segmented. Radiomic features and seed-pooled embeddings from 3D ResNet18, DenseNet121, and U-NEXtractor were modeled with elastic-net logistic regression or XGBoost. Out-of-fold Platt calibration and sensitivity-targeted thresholds were estimated using development data only. AUCs were summarized with 95% bootstrap confidence intervals. Results: External radiomics with elastic-net achieved an AUC of 0.609 (95% CI: 0.464, 0.740) and sensitivity of 0 of 12 (0%). DenseNet121 with elastic-net had the highest external AUC (0.685; 95% CI: 0.538, 0.822) but sensitivity of 3 of 12 (25%). U-NEXtractor with elastic-net detected 10 of 12 positive cases (83.3%) with specificity of 32 of 64 (50.0%) and balanced accuracy of 0.667. XGBoost showed higher apparent development performance but weaker external operating behavior. Conclusion: Under real-world cross-site MRI acquisition shift, DenseNet121 and U-NEXtractor embeddings showed better external generalization than handcrafted radiomic features for substantial LVSI prediction.

8

Efficient stochastic epidemic simulation via the Sellke construction

van Boven, M.; Bootsma, M. C.

2026-07-17 epidemiology 10.64898/2026.07.16.26358219 medRxiv

Top 4%

0.1%

Stochastic epidemic models are a cornerstone of infectious disease epidemiology and are often used to study intervention scenarios. However, large run-to-run variability can make intervention effects difficult to estimate precisely. We revisit the epidemic Sellke construction, which assigns each individual an infection threshold for the cumulative infection hazard such that, conditional on the thresholds, the epidemic trajectory becomes deterministic. This enables coupling of simulations with and without an intervention, yielding low-variance effect estimates even when outcomes such as final size or peak incidence vary widely between runs. We develop an exact, event-driven implementation that maintains infection and recovery events in priority queues. Cumulative infection-hazard updates require O(log N) time per event, yielding overall complexity O(E log N) for E events in a population of size N. The implementation achieves computational performance comparable to the classical Gillespie algorithm while naturally accommodating non-Markovian infectious periods and complex infectiousness profiles. We illustrate the approach using distance-dependent spread of avian influenza between poultry farms in the Netherlands and a multilayer population with households, schools, and workplaces. In both examples, coupling enables efficient within-run comparisons of intervention scenarios across stochastic realisations.
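
The coupling idea can be illustrated with the textbook Sellke final-size calculation for a homogeneously mixing population. This is a deliberately reduced sketch, not the event-driven priority-queue implementation the abstract describes: it tracks only the final size, models the intervention as a lower transmission rate `beta`, and all names and parameter values are assumptions chosen for the example.

```python
import random


def sellke_final_size(thresholds, infectious_periods, beta, n_initial=1):
    """Number of initially susceptible individuals ever infected.

    thresholds: one Exp(1) infection threshold per susceptible.
    infectious_periods: one period per individual, consumed in order of
        infection (the first n_initial entries belong to the initial cases).
    """
    ordered = sorted(thresholds)
    n_total = len(ordered) + n_initial
    # Total infectious time contributed by everyone infected so far.
    infectious_time = sum(infectious_periods[:n_initial])
    infected = 0
    # The next susceptible is infected if the cumulative per-capita hazard
    # (beta / N) * infectious_time reaches that individual's threshold.
    while infected < len(ordered) and ordered[infected] <= beta / n_total * infectious_time:
        infectious_time += infectious_periods[n_initial + infected]
        infected += 1
    return infected


# Draw the randomness once, then reuse it for both scenarios (the coupling).
rng = random.Random(1)
thresholds = [rng.expovariate(1.0) for _ in range(199)]
periods = [rng.expovariate(1.0) for _ in range(200)]

baseline = sellke_final_size(thresholds, periods, beta=2.0)
with_intervention = sellke_final_size(thresholds, periods, beta=1.2)
print(baseline, with_intervention, baseline - with_intervention)
```

Because both runs share the same thresholds and infectious periods, the difference in final size reflects the change in `beta` alone, and in this sketch the intervention run can never be larger than the baseline.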

9

Diagnostic Accuracy of MRI Radiomics for Predicting KRAS Mutation in Rectal Cancer: A Systematic Review and Meta-analysis

Saleh, M. M.; Hegazy, M.; Alsaied, M. A.; Elkenani, A. J.; Ehab, R.; Hesham, M.; Abdelrazek, H. M.; Nazemi, S.; Shalaby, M.; El-Hussuna, A.

2026-07-20 radiology and imaging 10.64898/2026.07.17.26358357 medRxiv

Top 5%

0.1%

Background: KRAS mutation status is an important biomarker in rectal cancer, with implications for prognosis and treatment response. MRI-based radiomics has emerged as a non-invasive approach for predicting tumor genotypes. However, the diagnostic performance of MRI radiomics for predicting KRAS mutation status remains unclear. This study aimed to evaluate the diagnostic accuracy of MRI radiomics for predicting KRAS mutations in rectal cancer. Methods: A systematic search of PubMed, Cochrane Library, Scopus, and Web of Science was performed through July 2025. Diagnostic test accuracy studies evaluating MRI-based radiomics or artificial intelligence models for predicting KRAS mutation status in adult patients with rectal cancer were included, using molecular testing as the reference standard. Risk of bias was assessed using the QUADAS-2 tool. Pooled sensitivity and specificity were estimated using a bivariate random-effects model. Results: Seven studies involving 1,224 patients were included. The pooled sensitivity was 0.736 (95% CI: 0.697-0.772) and the pooled specificity was 0.645 (95% CI: 0.586-0.701). The false positive rate was 0.355 (95% CI: 0.299-0.414). The area under the hierarchical summary receiver operating characteristic curve was 0.754, with a normalized partial AUC of 0.666. Between-study heterogeneity ranged from low to moderate depending on the estimation method (I2 = 8.4%-53.3%). Conclusion: MRI radiomics demonstrates moderate diagnostic accuracy for predicting KRAS mutation status in rectal cancer and may serve as a promising non-invasive biomarker for preoperative molecular stratification. Further large-scale studies with external validation are required to confirm its clinical utility.

10

Statistical Inference and Power Analysis for Comparative F1 and Fβ Scores under Correlated Classifier Pairs

Hsu, C.-Y.; Liu, Q.; Shyr, Y.

2026-07-17 dermatology 10.64898/2026.07.15.26358166 medRxiv

Top 5%

0.1%

As machine learning and artificial intelligence systems are increasingly used in healthcare, rigorous evaluation of their classification performance has become critical. The F1 and Fβ scores are widely adopted metrics for assessing performance in imbalanced biomedical data. Recently, we introduced psF1, a unified statistical framework for inference and study design for single and comparative F1 and Fβ scores under the assumption of independent classifiers. In practice, however, benchmarking two classifiers on the same dataset creates a correlated paired setting. Ignoring this intrinsic dependency leads to overestimation of the standard error and a substantial loss of statistical power. To address this, we develop psF1pair, an advanced framework for statistical inference and power analysis that explicitly accounts for correlations between classifier pairs. Extensive simulation studies demonstrate the performance of psF1pair, and its utility is further illustrated through application to a real-world imaging classification system. As expected, higher correlation between classifiers yields narrower confidence intervals and enhanced statistical power. A freely available R package is provided to facilitate implementation, supporting accurate evaluation and study design for predictive and classification models in biomedical research.
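
For reference, the Fβ score being compared is a function of the true-positive, false-positive, and false-negative counts. The sketch below computes it for two classifiers evaluated on the same cases; it shows only the point estimates and does not reproduce the psF1pair inference, and the counts are made up for illustration.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta score from confusion-matrix counts; beta=1 gives F1."""
    b2 = beta * beta
    denominator = (1 + b2) * tp + b2 * fn + fp
    if denominator == 0:
        return 0.0  # no positives predicted or present: define as 0 here
    return (1 + b2) * tp / denominator


# Hypothetical counts for two classifiers scored on the same test set.
f1_a = f_beta(tp=8, fp=2, fn=2)
f1_b = f_beta(tp=8, fp=4, fn=2)
print(f1_a, f1_b, f1_a - f1_b)
```

Because both scores come from the same cases, their sampling errors are correlated, which is the dependency the paired framework accounts for when forming an interval for the difference.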

11

Privacy-Preserving Matching for Federated Causal Inference in Multicentre Patient Cohorts

Gusinow, R.; Morgan, A. S.; Canziani, L. M.; Zeitlin, J.; Kim, M.; Gentilotti, E.; Ghosn, J.; Florence, A.-M.; Tami, A.; Toschi, A.; Palacios-Baena, Z. R.; Tacconelli, E.; Hasenauer, J.

2026-07-19 epidemiology 10.64898/2026.07.16.26358171 medRxiv

Top 5%

0.1%

Causal effect estimates can often be biased in clinical and epidemiological studies as patient cohorts frequently exhibit substantial covariate imbalances between treated and control groups, often amplified in multicentre studies due to heterogeneous recruitment, clinical practice, and case mix. Covariate balancing methods are therefore essential for valid causal inference. However, their application becomes challenging when data are distributed across cohorts and cannot be pooled because of privacy, legal, or institutional constraints, leaving a gap in practical methods for causal effect estimation in federated and imbalanced clinical data settings. We develop a privacy-preserving framework for covariate balancing and causal effect estimation across distributed data providers, combining federated aggregation with differential privacy to enable propensity score subclassification and matching without sharing individual-level records. Matching relies on non-disclosive quantities and differentially private distance evaluation, and the resulting matched subsets remain local to each server. Balance can be assessed through federated diagnostics and privacy-preserving visualisations, and we provide secure estimators for average treatment effects with associated uncertainty quantification. We implement this framework in the DataSHIELD federated analysis platform via two R packages. In simulations, we demonstrate agreement between federated and centralised analyses in the absence of privacy noise and quantify the bias-variance trade-offs induced by differential privacy. We illustrate applicability in two multinational settings, a Long COVID cohort and very preterm birth cohorts, showing that the approach enables practical causal analyses under real-world data protection constraints. The DataSHIELD packages are available on GitHub. Additional methodological details are provided in the Supplementary Material.

12

Neonatal admission as a marker of risk for poor educational attainment and special educational needs in children aged 5-11 years

John, A.; Pike, C.; Olga, L.; Sovio, U.; Wong, H. S.; Smith, G. C.; Aiken, C.

2026-07-17 pediatrics 10.64898/2026.07.15.26358132 medRxiv

Top 6%

0.0%

Background: Children born prematurely (before 37 weeks) or admitted to the neonatal unit (NNU) are at increased risk of adverse long-term physical health outcomes. It is also recognised that there is an association with later academic performance and special educational needs; however, it is not clear whether these broad risk factors could be used as stand-alone heuristics to identify children who may benefit from additional support in educational settings. We aimed to examine the associations between NNU admission and educational attainment in mid-childhood. Methods and Findings: Pregnancy data from a prospective birth cohort (Pregnancy Outcome Prediction Study, Cambridge, United Kingdom, 2008-2012) were linked to national educational outcomes (Department for Education, United Kingdom). Multivariable regression models adjusted for maternal, child, and socioeconomic factors were used to evaluate associations between (i) all NNU admissions, (ii) term NNU admissions >48 hours, (iii) preterm birth without ongoing physical health needs, and educational outcomes at ages 5-11 years. Children who required any NNU care were more likely not to meet expected educational standards across multiple ages and domains in early and mid-childhood: age 5 early year foundation (aOR 1.64, 95% CI 1.19-2.27, p=0.003), phonics at age 6 (aOR 2.43, 95% CI 1.72-3.57, p<0.001), and at age 7 (here assessments were divided into multiple domains): reading (aOR 1.67, 95% CI 1.18-2.38, p=0.004), writing (aOR 1.72, 95% CI 1.25-2.38, p<0.001), mathematics (aOR 1.56, 95% CI 1.09-2.22, p=0.020), and science (aOR 1.85, 95% CI 1.22-2.78, p=0.003). Similar patterns were observed both among term-born infants who stayed >48 hours in the NNU (phonics assessment at age 6 aOR 2.26, 95% CI 1.51-3.36, p<0.001) and among children born preterm without long-term physical health sequelae (phonics assessment at age 6 aOR 3.07, 95% CI 1.96-4.81, p<0.001).
These associations were robust to adjustment for demographic, perinatal, and socio-economic factors. By age 11, differences in academic attainment were attenuated and no longer clearly distinguishable across all exposure groups. However, there was an increased likelihood of special educational needs (SEN) at age 11 associated with any NNU admission (aOR 1.78, 95% CI 1.15-2.73, p=0.009), at term NNU admission for >48hrs (aOR 1.88, 95% CI 1.19-3.00, p=0.007), and children born preterm without long-term physical health sequelae (aOR 1.50, 95% CI 1.00-2.25, p=0.049). Predictive performance of any NNU admission for SEN at age 11 was moderate (AUC 0.70, 95% CI: 1.14-2.65, p=0.010), with balanced sensitivity and specificity and high negative predictive value. Conclusions: NNU admission, for both term and preterm infants, is associated with poorer educational outcomes and an increased likelihood of special educational needs in mid-childhood.

13

Bridging surveillance gaps in dengue: a hierarchical model integrating mixed data sources for transmission estimation and vaccine targeting

Djaafara, B. A.; Elyazar, I. R.; Yosephine, P.; Surya, A.; Silalahi, F. S.; Handito, A.; Thohir, B.; Aryani, D.; Gunawan, D.; Nisa, A. K.; Prianto, E.; Samad, I.; Cook, A. R.; Huang, A. T.; Clapham, H. E.; Bhatt, S.; Mishra, S.

2026-07-17 epidemiology 10.64898/2026.07.15.26358208 medRxiv

Top 6%

0.0%

Estimating dengue force of infection (FOI) is essential for understanding transmission dynamics and targeting intervention programmes, yet surveillance data in endemic settings required for estimations are often incomplete, with varying formats. We developed a Bayesian hierarchical catalytic model that jointly fits age-stratified case data, aggregate case data, and seroprevalence surveys within a single framework, incorporating external covariates to improve parameter identifiability. Synthetic validation showed that covariates alone recovered accurate FOI point estimates even when most districts contributed only aggregate data, but did so with poorly calibrated uncertainty; anchoring the model with a single seroprevalence survey was necessary to bring credible interval coverage close to nominal. Applied to 128 districts across Java and Bali, Indonesia (2016-2024), the model revealed substantial spatial heterogeneity in FOI and reporting rates. Many districts in Java exceeded the WHO-suggested seroprevalence threshold for vaccine introduction, yet were classified as low-priority when using reported incidence as prioritisation criterion, particularly in areas with weak surveillance. Model-based seroprevalence estimation, integrating multiple data sources, offers a more consistent basis for identifying high-priority districts for vaccine introduction, and is less susceptible to surveillance bias than reported incidence.
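
The link between force of infection and seroprevalence that underlies catalytic models can be sketched in its simplest form: with a constant FOI and lifelong seropositivity, the expected proportion seropositive at age a is 1 - exp(-FOI x a). The code below shows only this basic relationship under those assumptions; the hierarchical structure, covariates, and reporting rates of the model described above are not represented, and the numbers are illustrative.

```python
import math


def seroprevalence(foi, age):
    """Expected proportion seropositive at a given age under a constant FOI."""
    return 1.0 - math.exp(-foi * age)


def foi_from_seroprevalence(proportion, age):
    """Invert the simple catalytic relationship for a single age group."""
    return -math.log(1.0 - proportion) / age


# Illustrative values only: an annual FOI of 0.1 evaluated at age 9.
p9 = seroprevalence(0.1, 9)
print(p9, foi_from_seroprevalence(p9, 9))
```

This is why a seroprevalence survey anchors FOI directly, whereas reported incidence also depends on how many infections are detected and reported.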

14

General Practice Perspectives on Post-Infection Conditions: Scoping Review and UK Survey

Aung, K. W.; Scuffell, J.; Podlasek, A.; Engamba, S.; Jones, F.; Edwards, A.; Chew-Graham, C. A.; Sanyaolu, L.; Busse-Morris, M.

2026-07-17 primary care research 10.64898/2026.07.15.26358157 medRxiv

Top 6%

0.0%

Background Post-infection conditions (PICs), such as Long Covid, are associated with heterogeneous, fluctuating symptoms that profoundly affect daily functioning. Despite moderate-certainty evidence from the NIHR-funded LISTEN trial (COV-LT2-0009) that personalised self management support improves outcomes and may reduce societal and economic impacts of Long Covid, many people living with PICs still receive condition-specific services, generic advice, or stand-alone digital tools that do not address their complex needs. Aim To map care approaches in general practice and synthesise UK evidence for PIC management. Design and setting Scoping review and online survey. Method A two-phase study was conducted: (1) a scoping review of UK evidence on PIC management in general practice; and (2) a supplementary online survey of practitioners working in UK general practice to provide contextual insights. Results The scoping review identified 32 studies focused on Long Covid. One study included a comparator group (ME/CFS). Study populations were predominantly white ethnicity and female. Evidence for non-Covid PICs in UK general practice was largely absent. The supplementary survey (n=46) provided preliminary practice-level insights. Healthcare practitioners reported varied PIC presentations, diagnostic uncertainty, limited referral pathways, inequitable access, and low confidence in managing PICs. Conclusion Evidence informing PIC management in UK general practice remains predominantly Long Covid-focused and may not reflect the range of PICs encountered in practice. While survey findings are preliminary and require confirmation in larger samples, they highlight uncertainty around PIC management. Further research is needed to evaluate whether existing Long Covid pathways should be expanded or complemented by broader PIC models. Keywords general practice; Long Covid; self-management; post-viral syndromes

15

Multilevel Factors Associated with Nonresponse to Patient-Reported Outcome Measures in Routine Radiation Oncology Care

Liu, J. B.; Chen, Y.-J.; Edelen, M. O.; Pusic, A. L.; Martin, N. E.; Zeng, C.

2026-07-17 health systems and quality improvement 10.64898/2026.07.15.26358162 medRxiv

Top 6%

0.0%

Purpose: Nonresponse to routinely collected patient-reported outcome measures (PROMs) threatens the representativeness of aggregated data. We characterized patient-, provider-, and clinic-level factors associated with PROMIS Global-10 nonresponse in routine radiation oncology care. Methods: In this retrospective cohort study, all adults seen at five Mass General Brigham radiation oncology clinics over one year were included. The primary outcome was patient-level nonresponse, defined as never completing the portal-administered Global-10 versus completing it at least once. Using iterative mixed-effects logistic regression, we modeled patient-, provider-, and clinic-level factors. Results: Among 12,214 patients, 71 providers, and five clinics, patient- and appointment-level response rates were 35.4% and 10.9%, with patient-level response ranging nearly fivefold across clinics (12.8% to 66.2%). In Model 1, male sex, lower education, not working, and recent surgery had higher odds of nonresponse, and longer time since diagnosis lower odds. After provider- and clinic-level factors were added, patient sex, education, and employment became nonsignificant, whereas recent surgery (adjusted odds ratio [aOR] 1.97) and longer time since diagnosis (aOR 0.46 for >12 months) persisted. A provider's historical collection rate was protective but attenuated at the clinic level. There, a later program launch (aOR 0.29) and higher historical collection rate (aOR 0.79) correlated with lower nonresponse, whereas academic versus community setting did not. Conclusions: Nonresponse to routinely collected PROMs is a multilevel phenomenon driven substantially by clinic-level implementation factors, not patient characteristics alone. Because response rate is only a proxy for representativeness, PROMs programs and PRO-based performance measures should prioritize representative collection over volume.

16

Genome-Wide Association Studies and Deep-Learning Functional Annotation of Opioid Use Disorder across Three Ancestries in the All of Us Research Program

Gu, S.; Petrovitch, D.; Hall, O. T.; Lambert, J. W.; Kember, R. L.; Nahid, N. A.; Ma, Q.; Sprague, J. E.; McDonough, C. W.; Johnson, J. A.

2026-07-17 addiction medicine 10.64898/2026.07.15.26358096 medRxiv

Top 6%

0.0%

Background: Opioid use disorder (OUD) is heritable, yet most genome-wide association studies (GWAS) have focused on European populations, leaving the genetic architecture of OUD in non-European populations underexplored. Methods: We conducted GWAS of OUD across three ancestries using electronic health records and genomic data from 52,357 All of Us Research Program participants (8,912 cases; 43,445 matched opioid-exposed controls; 48.5% female). Participants were stratified into European (EUR), African (AFR), and Admixed American (AMR) ancestry groups for logistic regression GWAS, with independent replication in the Million Veteran Program. We then applied the deep-learning model AlphaGenome to predict the tissue-specific transcriptomic and splicing consequences of top risk variants across 13 reward-pathway brain regions. Results: We identified and replicated a novel DDX6 risk locus, alongside established OPRM1 and FURIN signals. AlphaGenome predicted the DDX6 regulatory allele downregulates the stress-resistance gene FOXR1 in the nucleus accumbens, while the protective OPRM1 variant (rs1799971) upregulates OPRM1 expression across reward networks. Other signals of interest included IL6R and SHISA9 (EUR); GHR (AFR); and ASTN2 (AMR). Conclusions: This study identifies DDX6 as a novel OUD risk locus, replicates associations with OPRM1 and FURIN, and highlights biologically plausible ancestry-specific signals in AFR and AMR populations. We also replicated top variants in an independent population. Finally, integrating GWAS with deep-learning annotations provides specific, localized biological hypotheses to guide future experimental validation and targeted therapeutics.

17

Complex intra-host SARS-CoV-2 evolution following monoclonal antibody pre-exposure prophylaxis

Kamelian, K.; Pascall, D. J.; Cheng, M. T. K.; Meng, B.; Altaf, M.; Morse, R. M.; Aggio, J. B.; Egan, D. J. S.; Chen-Xu, M.; Trivioli, G.; Sutton, B.; Richter, A.; Gonzalez-Vazquez, L. D.; Cormie, C.; Kemp, S.; Yeadon, R.; Hyatt, B.; Wong, A.; Thesin Pelamkulangara, N.; Fraser, E.; McCarthy, B.; Novaes, F.; Stott, S.; Galvin, A.; Bellis, K. L.; De Angelis, D.; Harrison, E. M.; Martin, D.; Smith, R. M.; Gupta, R. K.

2026-07-17 infectious diseases 10.64898/2026.07.14.26356329 medRxiv

Top 6%

0.0%

Background: Monoclonal antibodies have emerged as a prophylactic strategy to prevent symptomatic SARS-CoV-2 infection in immunocompromised individuals. However, the evolutionary and clinical implications of breakthrough infections under this regime remain unclear. Methods: A male in their 80s with a haematological/oncological diagnosis received a 2000 mg intravenous infusion of sotrovimab in March 2023 and was diagnosed with COVID-19 by RT-qPCR from a nasopharyngeal swab in August 2023. Weekly samples (n=24) were collected through February 2024 (171 days). All samples underwent whole-genome sequencing, with select mutations subjected to functional assessment. Findings: Sequencing identified the GE.1 lineage at all timepoints. An intra-host recombination event in ORF1ab (positions 8942-12458) was detected prior to 23 weeks post-detection, followed by a 14-fold increase in viral load (7.42e+06 to 1.00e+08 RNA copies/mL) and a marked shift in the viral population. E340D, a sotrovimab resistance mutation, was detected at low abundance (46%) within the first week post-infection, fluctuated over time, and was nearly fixed by week 15 (107 days) post-detection. We assessed five spike mutations - V36M, S98F, and V213G in the N-terminal domain, Y505P in the receptor-binding domain, and P681Q near the S1/S2 cleavage site - and additionally evaluated the impact of E340D. V36M conferred the highest infectivity across all cell lines, with the most significant effect in low-TMPRSS2 cells. While all mutations showed enhanced infectivity with the addition of E340D, the effect was most pronounced in mutations with lower baseline infectivity. The addition of E340D significantly decreased relative neutralizing titres for V36M, S98F, and V213G, enabling escape from neutralizing antibodies in XBB-responsive individuals, illustrating an enhanced phenotypic advantage. Patient neutralizing activity was absent pre-sotrovimab, and sotrovimab-induced neutralization was further compromised by selection of E340D. Interpretation: Sotrovimab pre-exposure prophylaxis in an immunocompromised patient did not prevent SARS-CoV-2 infection, and selected for resistant mutation E340D, with unexpected fitness consequences across non-receptor binding domain spike regions.

18

Temporal relationships between distress and pain in people living with HIV

Arendse, G.; Kamerman, P.; Wadley, A.; Edwards, R. R.; Joska, J.; Parker, R.; Madden, V. J.

2026-07-17 primary care research 10.64898/2026.07.15.26358133 medRxiv

Top 6%

0.0%

Objective: There is a bidirectional relationship between emotional distress and pain. However, this relationship is understudied in people with HIV in low-resource settings. This study sought to describe the temporal relationship between emotional distress and pain in people with HIV. Design: Longitudinal observational study. Methods: Participants with virally suppressed HIV, reporting either no pain or persistent pain at baseline, provided weekly remote ratings of distress, worst pain, and average pain using 0-10 visual analogue scales. Within-individual fluctuations in distress and pain were visualised over time. Group-level correlations were determined using Spearman's correlation tests. Cumulative link mixed models assessed whether distress and pain each predicted the other in the following week. Results: 72 participants provided responses over 49 weeks. The participants had a median (IQR) age of 43 (37-51) years, 63% (n=45) were unemployed, and most were female (n=51; 71%). Distress and pain fluctuated concurrently within individuals: distress was positively correlated with worst pain (ρ=0.66, 95% CI=0.60-0.72, p<0.001) and average pain (ρ=0.70, 95% CI=0.64-0.75, p<0.001) intensity within the same week. Worst pain (OR=1.42, 95% CI=1.17-1.71, p<0.001) and average pain (OR=1.43, 95% CI=1.20-1.71, p<0.001) intensity both predicted distress in the next week. Distress predicted worst pain intensity (OR=1.25, 95% CI=1.07-1.46, p=0.023) but not average pain intensity (OR=1.19, 95% CI=1.01-1.40, p=0.152) in the next week. Conclusions: The temporal relationship between distress and worst pain intensity was bidirectional, whereas distress did not temporally predict average pain intensity. Both pain and emotional distress should receive attention from HIV research and clinical care in low-resource settings.

19

Trends and variations in Lithium usage across care settings in England between 2015-2024

Schiffer, H.; Fisher, L.; Curtis, H. J.; Wood, C.; Brown, A. D.; Bacon, S. C.; Croker, R.; Goldacre, B.; MacKenna, B.; Speed, V.; Macdonald, O.

2026-07-17 psychiatry and clinical psychology 10.64898/2026.07.15.26357641 medRxiv

Top 6%

0.0%

Lithium has been the gold standard for the treatment and prevention of relapse in bipolar disorder for over 60 years. Guidance from the National Institute for Health and Care Excellence explicitly advises clinicians to 'offer lithium as a first-line, long-term pharmacological treatment for bipolar disorder'. Yet in the last two decades its use has been in decline, with clinicians favouring anticonvulsants or antipsychotics when treating this condition. In this study, we used three openly available datasets containing prescribing data from primary and secondary care to explore trends in the use of lithium in England, showing both regional and temporal variation between 2015 and 2024. We show that lithium use declined in primary care by 20.9% in the last ten years (2015-2024) and 10.9% overall in the last five years (2019 to 2025). We also show that there is some regional variation in the source of lithium for patients, although the vast majority is prescribed in primary care. Further research into clinical behaviour is needed to understand what is driving the decrease in lithium usage, and what barriers and enablers may influence its use across the country.

20

Human GPR174 deficiency drives polyclonal lymphoproliferative disease via defects in T cell function

Huang, Y.-H.; Arana, K.; Rachimi, S.; Tam, H.; Spegarova, J. S.; Engelhardt, K. R.; Griffin, H.; Mee, M.; Miano, M.; Raggi, F.; Grossi, A.; Rusmini, M.; Ceccherini, I.; Dell'Orso, G.; Ferro, J.; Giarratana, M. C.; Pillai, V.; Banka, S.; Garcez, T.; Briggs, T. A.; Mellouli, F.; von Hardenberg, S.; Beier, R.; Auber, B.; Baumann, U.; Tawamie, H.; Behrens, E.; Oldridge, D. A.; Cabrera, E. C.; Xu, Y.; Ouyang, S.; Hambleton, S.; Romberg, N.; Cyster, J. G.

2026-07-17 rheumatology 10.64898/2026.07.14.26357774 medRxiv

Top 6%

0.0%

The X-linked G protein-coupled receptor GPR174 is highly expressed in T and B lymphocytes and has immunoregulatory roles in mice, but its function in humans is unknown. We describe a cohort of six individuals who have function-disrupting variants in GPR174 and a clinical phenotype of lymphadenopathy and autoimmunity. Histological analysis of two patient lymph nodes revealed necrotizing lymphadenitis and lymphoproliferation resembling Kikuchi-Fujimoto disease. In-depth analysis of three patients and related carriers revealed overaccumulation of CD8 terminally differentiated effector memory cells re-expressing CD45RA (TEMRA). Patient cells and GPR174-deficient CD8 T cells generated from controls showed less repression of proliferation by the GPR174 ligand lysophosphatidylserine (lysoPS) and an effector-biased gene expression program. GPR174-deficient CD4 T cells were resistant to lysoPS-mediated suppression of IL2 production. In mice, chronic viral infection led to overaccumulation of GPR174-deficient effector CD8 T cells. We describe an inborn error of immunity associated with dysregulated lymphocyte responses that we propose predisposes to exaggerated lymphoproliferation and autoimmunity following viral infection.